Hasenbusch has proposed splitting the pseudo-fermionic action into two parts in order to speed up Hybrid
Monte Carlo simulations of QCD. We have tested a different splitting, also using clover-improved Wilson fermions.
An additional speed-up of between 5% and 20% over the original proposal was achieved in production runs.



1. INTRODUCTION 

Hybrid Monte Carlo (HMC) [1] is the standard algorithm employed in
numerical simulations of full QCD. However, the computational cost of
such simulations grows rapidly with decreasing quark mass. At light
quark masses (a) the condition number of the fermion matrix increases,
which leads to an increased number of iterations in solving the
corresponding system of linear equations, (b) the acceptance rate
decreases, which has to be compensated by decreasing the integration
step size, and (c) the autocorrelation time in units of trajectories
increases.

In [2] Hasenbusch proposed numerical methods to improve conditions (a)
and (b) in order to accelerate HMC simulations with dynamical fermions.
He suggested splitting the fermion matrix into two pieces, both having
a smaller condition number than the original matrix. For each factor a
pseudo-fermionic field is introduced, and the Yang-Mills and fermionic
parts of the action are put onto different time-scales in the leap-frog
integration. These methods were tested in simulations with
clover-improved Wilson fermions and a speed-up of 2 was obtained [3].
The acceleration is greater at lower quark masses [4].

* Poster presented by H. Stüben at Lattice 2003.

The multiple time-scale approach was initially advocated in [6], where
Yang-Mills and pseudo-fermionic terms were put onto different
time-scales. The idea was refined in [7], where the following criteria
for an efficient splitting of the action $S = S_{\rm UV} + S_{\rm IR}$
were formulated. The force term generated by $S_{\rm UV}$ should be
cheap to compute compared with that of $S_{\rm IR}$, and the splitting
should mainly capture the high-frequency modes of the system in
$S_{\rm UV}$ and the low-frequency modes in $S_{\rm IR}$. In order to
achieve this, a low-order polynomial approximation mimicking the
high-frequency modes was introduced in the fermionic action and the
action was split accordingly [7].

In this study the aforementioned methods are combined. The fermion
matrix is split according to [2]. Following [7], the two fermionic
contributions are put onto different time-scales (this possibility was
already mentioned in [2], but no additional advantage was found). In a
production run we compared our splitting with the splitting of [3] and
found an additional speed-up of about 20% [5]. Here we report on the
same comparison for a run at smaller quark mass.

2. NOTATION, TECHNICAL DETAILS 

2.1. Actions 

We simulated two-flavour QCD with clover-improved Wilson fermions
employing even/odd preconditioning. The standard action for this model
reads

$$S_0[U,\phi^\dagger,\phi] = S_G[U] + S_{\det}[U] + \phi^\dagger (Q^\dagger Q)^{-1} \phi, \tag{1}$$

where $S_G[U]$ is the standard Wilson plaquette action, $\phi^\dagger$
and $\phi$ are pseudo-fermion fields, and

$$S_{\det}[U] = -2\,{\rm Tr}\log(1 + T_{oo}), \tag{2}$$

$$Q = (1+T)_{ee} - M_{eo}\,(1+T)_{oo}^{-1}\,M_{oe}. \tag{3}$$

$T_{ee}$ ($T_{oo}$) is the clover matrix on even (odd) sites,

$$(T)_{a\alpha,b\beta}(x) = \frac{i}{2}\, c_{sw}\, \kappa\, \sigma_{\mu\nu}^{\alpha\beta} F_{\mu\nu}^{ab}(x). \tag{4}$$

$M_{eo}$ and $M_{oe}$ are Wilson hopping matrices connecting even with
odd and odd with even sites, respectively.

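The standard action is modified [2] by introducing an auxiliary matrix
$W = Q + \rho$, $\rho \in \mathbb{R}$, and pseudo-fermion fields
$\chi^\dagger$, $\chi$:

$$S_1[U,\phi^\dagger,\phi,\chi^\dagger,\chi] = S_G[U] + S_{\det}[U] + \phi^\dagger W (Q^\dagger Q)^{-1} W^\dagger \phi + \chi^\dagger (W^\dagger W)^{-1} \chi. \tag{5}$$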
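The effect of the auxiliary matrix on the condition numbers can be illustrated with a toy model. The sketch below assumes, purely for illustration, that $Q$ is replaced by a positive Hermitian matrix with a known spectrum (the preconditioned matrix $Q$ of Eq. (3) is not of this form); the names `eigs_q` and `rho` and all numerical values are hypothetical.

```python
def condition_number(spectrum):
    """Ratio of the largest to the smallest value of a positive spectrum."""
    return max(spectrum) / min(spectrum)

# Hypothetical eigenvalues of a positive Hermitian stand-in for Q (toy assumption).
eigs_q = [0.01 + 0.99 * i / 99 for i in range(100)]  # runs from 0.01 to 1.0
rho = 0.1  # illustrative shift in W = Q + rho

# Spectrum of Q^dagger Q, the operator in the unmodified action.
cond_full = condition_number([lam ** 2 for lam in eigs_q])
# Spectrum of W^dagger W, the operator in the chi term.
cond_w = condition_number([(lam + rho) ** 2 for lam in eigs_q])
# Spectrum of W^{-dagger} Q^dagger Q W^{-1}, the inverse of the operator in the phi term.
cond_ratio = condition_number([(lam / (lam + rho)) ** 2 for lam in eigs_q])

print(cond_full, cond_w, cond_ratio)
```

In this toy spectrum the single operator has a condition number of about $10^4$, while each of the two factors has a condition number of about $10^2$. The two factors are balanced here because `rho` was chosen as the geometric mean of the smallest and largest eigenvalue; other choices shift the burden from one factor to the other.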

2.2. Multiple time-scales 

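One step of the reversible integrator $V_n$ we used is given by

$$V_n(\tau) = V_{\rm IR}\!\left(\frac{\tau}{2}\right) \left[ V_{\rm UV}\!\left(\frac{\tau}{2n}\right) V_Q\!\left(\frac{\tau}{n}\right) V_{\rm UV}\!\left(\frac{\tau}{2n}\right) \right]^n V_{\rm IR}\!\left(\frac{\tau}{2}\right), \tag{6}$$

where $n$ is a positive integer and the time-scales are $\tau$ and
$\tau/n$. The effect of $V_Q$, $V_{\rm UV}$, $V_{\rm IR}$ on the system
coordinates $\{P, Q\}$ is:

$$V_Q(\tau):\quad Q \to Q + \tau P, \tag{7}$$

$$V_{\rm UV}(\tau):\quad P \to P - \tau\,\partial S_{\rm UV}, \tag{8}$$

$$V_{\rm IR}(\tau):\quad P \to P - \tau\,\partial S_{\rm IR}. \tag{9}$$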
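The structure of Eq. (6) can be sketched for a single degree of freedom as follows. This is a minimal illustration, not the production code: the quadratic toy actions, the constants `K_UV` and `K_IR`, and the function names are assumptions introduced here, with a stiff but cheap $S_{\rm UV}$ and a soft $S_{\rm IR}$ standing in for the real force terms.

```python
def integrator_step(q, p, tau, n, grad_uv, grad_ir):
    """One step V_n(tau) of Eq. (6) for a single coordinate q and momentum p."""
    p -= 0.5 * tau * grad_ir(q)            # V_IR(tau/2)
    for _ in range(n):
        p -= tau / (2 * n) * grad_uv(q)    # V_UV(tau/2n)
        q += tau / n * p                   # V_Q(tau/n)
        p -= tau / (2 * n) * grad_uv(q)    # V_UV(tau/2n)
    p -= 0.5 * tau * grad_ir(q)            # V_IR(tau/2)
    return q, p

def trajectory(q, p, tau, n, n_steps, grad_uv, grad_ir):
    """Apply n_steps integrator steps in sequence."""
    for _ in range(n_steps):
        q, p = integrator_step(q, p, tau, n, grad_uv, grad_ir)
    return q, p

# Toy actions (assumptions): S_UV = K_UV q^2 / 2 and S_IR = K_IR q^2 / 2.
K_UV, K_IR = 100.0, 1.0
grad_uv = lambda q: K_UV * q
grad_ir = lambda q: K_IR * q

def energy(q, p):
    return 0.5 * p * p + 0.5 * (K_UV + K_IR) * q * q

q0, p0 = 1.0, 0.0
q1, p1 = trajectory(q0, p0, 0.1, 10, 10, grad_uv, grad_ir)
delta_h = energy(q1, p1) - energy(q0, p0)

# Reversibility: flip the momentum, integrate again, flip back.
qb, pb = trajectory(q1, -p1, 0.1, 10, 10, grad_uv, grad_ir)
rev_error = abs(qb - q0) + abs(-pb - p0)
print(delta_h, rev_error)
```

The expensive force `grad_ir` is evaluated only at the ends of each step, while the cheap force `grad_uv` is evaluated $2n$ times per step. Because the step is symmetric, integrating forward and then backward returns to the starting point up to rounding errors.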

2.3. Splittings of the actions 

We performed simulations employing three splittings. The first
splitting is based on $S_0$ (1). The other two are different splittings
of $S_1$ (5).

Splitting A (Sexton and Weingarten [6]):

$$S_{\rm UV} = S_G[U], \qquad S_{\rm IR} = S_{\det}[U] + \phi^\dagger (Q^\dagger Q)^{-1} \phi. \tag{10}$$

Splitting B (Hasenbusch and Jansen [3]):

$$S_{\rm UV} = S_G[U], \qquad S_{\rm IR} = S_{\det}[U] + \phi^\dagger W (Q^\dagger Q)^{-1} W^\dagger \phi + \chi^\dagger (W^\dagger W)^{-1} \chi. \tag{11}$$

Splitting C (our proposal [5]):

$$S_{\rm UV} = S_G[U] + S_{\det}[U] + \chi^\dagger (W^\dagger W)^{-1} \chi, \qquad S_{\rm IR} = \phi^\dagger W (Q^\dagger Q)^{-1} W^\dagger \phi. \tag{12}$$

Our proposal (12) is motivated by the hypothesis that most of the
high-frequency modes of the pseudo-fermion part of the action (5) are
located in $\chi^\dagger (W^\dagger W)^{-1} \chi$. We also put the
clover determinant $S_{\det}[U]$ on the "ultraviolet" time-scale
because the force generated by it is computationally cheap. The
computationally expensive term
$\phi^\dagger W (Q^\dagger Q)^{-1} W^\dagger \phi$ is put on the
"infra-red" time-scale.

2.4. Solver 

The standard conjugate gradient algorithm was used. Starting vectors
were obtained from chronological inversion [8] with
$N_{\rm guess} = 7$. We checked reversibility by forward and backward
integration starting with thermalised configurations, whereupon
deviations of energies were less than $10^{-10}$.
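The role of the starting vector can be sketched with a small conjugate gradient solver. This is a deliberately simplified stand-in for chronological inversion: it reuses only the single previous solution as the starting guess instead of combining the last $N_{\rm guess}$ solutions, and the matrix (a periodic one-dimensional lattice Laplacian plus a mass term), the sources and all names are assumptions made for illustration.

```python
import math
import random

def apply_a(x, mass2=0.1):
    """Symmetric positive definite toy operator: periodic 1D Laplacian plus mass term."""
    n = len(x)
    return [(2.0 + mass2) * x[i] - x[i - 1] - x[(i + 1) % n] for i in range(n)]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def cg(b, x0, tol=1e-8, max_iter=1000):
    """Conjugate gradient for A x = b starting from x0; returns (solution, iterations)."""
    x = list(x0)
    r = [bi - ai for bi, ai in zip(b, apply_a(x))]
    p = list(r)
    rr = dot(r, r)
    target = tol ** 2 * dot(b, b)
    iterations = 0
    while rr > target and iterations < max_iter:
        ap = apply_a(p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
        iterations += 1
    return x, iterations

def relative_residual(b, x):
    r = [bi - ai for bi, ai in zip(b, apply_a(x))]
    return math.sqrt(dot(r, r) / dot(b, b))

rng = random.Random(0)
n = 64
b_old = [rng.uniform(-1.0, 1.0) for _ in range(n)]
# The source at the next molecular dynamics step differs only slightly (toy assumption).
b_new = [bi + 1e-3 * rng.uniform(-1.0, 1.0) for bi in b_old]

x_old, _ = cg(b_old, [0.0] * n)
x_cold, it_cold = cg(b_new, [0.0] * n)  # start from zero
x_warm, it_warm = cg(b_new, x_old)      # start from the previous solution
print(it_cold, it_warm)
```

When consecutive sources differ only slightly, the previous solution already has a small residual, so the solver typically needs fewer iterations than a start from zero. Any such extrapolated start must be converged tightly enough that the reversibility of the trajectory is not spoiled.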

2.5. Computational gain 

The CPU cost is roughly given by
$t_{\rm CPU} \propto (N_Q + N_W)\,\tau_{\rm int}$, where $N_Q$ and
$N_W$ are the numbers of multiplications (per trajectory) with
$Q^\dagger Q$
Table 1
Parameters and statistics. (Statistics for each parameter set in Table 2.)

run   V           β     κ        c_sw    m_π/m_ρ  trajectory length  statistics
(I)   16^3 × 32   5.29  0.13550  1.9192  ≈ 0.7    1                  300 trajectories
(II)  24^3 × 48   5.25  0.13575  1.9603  ≈ 0.6    0.5                100 trajectories


Table 2
Further parameters and performance results. (N_steps is the number of integrator steps (6).)

run   splitting  ρ    n  N_steps  P_acc  N_Q     N_W    N_Q + N_W  C_gain
(I)   A          -    3  140      0.601  139492  -      139492     1
      B          0.5  3  100      0.599  65951   5233   71184      1.95
                 0.2  3  70       0.664  47214   7378   54592      2.82
      C          0.5  3  50       0.547  45160   7687   52847      2.40
                 0.2  3  40       0.663  32659   12373  45032      3.42
(II)  A          -    3  180      0.780  267363  -      267363     1
      B          0.2  3  90       0.891  89517   3242   92759      3.29
                 0.1  3  90       0.871  66432   5786   72218      4.13
      C          0.2  3  50       0.799  74002   7967   81969      3.34
                 0.1  3  50       0.896  57018   13624  70642      4.35

and $W^\dagger W$, respectively. In order to estimate the computational
gain we assume $\tau_{\rm int} \propto 1/P_{\rm acc}$ [?] and calculate
the computational gain of splittings B and C compared to A by

$$C_{\rm gain}^{(B,C)} = \frac{N_Q^{(A)}}{N_Q^{(B,C)} + N_W^{(B,C)}}\; \frac{P_{\rm acc}^{(B,C)}}{P_{\rm acc}^{(A)}}. \tag{13}$$

3. RESULTS

We have tested splittings A, B and C in two production runs. The
parameters of the runs are listed in Table 1. Performance results are
shown in Table 2. The values for run (I) are old results [5]. The
values for run (II) are new. One sees that the speed-up is considerable
and that it grows with decreasing quark mass. The parameter $\rho$ has
to be lowered at smaller quark masses. In run (I) splitting C
accelerates the simulation by about 20% more than splitting B. In run
(II) the additional gain from using splitting C is only about 5%.

In conclusion, the methods proposed by Hasenbusch work very well. Our
variant of his method seems to perform even slightly better. In both
cases the choice of the new parameter $\rho$ affects the speed-up
noticeably. It would be interesting to know how the number of
integrator steps and the trajectory length influence the gain.
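The gain column of Table 2 and the quoted comparison between splittings C and B can be checked with a few lines of arithmetic. The sketch below takes the gain to be the ratio of the cost estimates $(N_Q + N_W)/P_{\rm acc}$ of splitting A and of the splitting in question, and copies the input numbers from Table 2; the function and variable names are introduced here for illustration only.

```python
def gain(n_ref, p_ref, n_q, n_w, p_acc):
    """Cost of splitting A divided by cost of the other splitting, cost ~ (N_Q + N_W) / P_acc."""
    return n_ref / (n_q + n_w) * (p_acc / p_ref)

# Splitting A reference values per run: (N_Q, P_acc), taken from Table 2.
REFERENCE = {"I": (139492, 0.601), "II": (267363, 0.780)}

# (run, splitting, rho, N_Q, N_W, P_acc, gain quoted in Table 2)
ROWS = [
    ("I", "B", 0.5, 65951, 5233, 0.599, 1.95),
    ("I", "B", 0.2, 47214, 7378, 0.664, 2.82),
    ("I", "C", 0.5, 45160, 7687, 0.547, 2.40),
    ("I", "C", 0.2, 32659, 12373, 0.663, 3.42),
    ("II", "B", 0.2, 89517, 3242, 0.891, 3.29),
    ("II", "B", 0.1, 66432, 5786, 0.871, 4.13),
    ("II", "C", 0.2, 74002, 7967, 0.799, 3.34),
    ("II", "C", 0.1, 57018, 13624, 0.896, 4.35),
]

computed = {}
for run, split, rho, n_q, n_w, p_acc, quoted in ROWS:
    n_ref, p_ref = REFERENCE[run]
    computed[(run, split, rho)] = gain(n_ref, p_ref, n_q, n_w, p_acc)
    print(run, split, rho, round(computed[(run, split, rho)], 2), quoted)

def best(run, split):
    """Largest gain found for a given run and splitting."""
    return max(v for (r, s, _), v in computed.items() if r == run and s == split)

# Additional speed-up of splitting C over splitting B at the best rho of each.
extra = {run: best(run, "C") / best(run, "B") for run in REFERENCE}
print(extra)
```

With the tabulated inputs the best gains of splitting C exceed those of splitting B by a factor of about 1.21 in run (I) and about 1.05 in run (II), in line with the 20% and 5% quoted in the text.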



